Lessons Learned when Comparing Shared Memory and Message Passing Codes on Three Modern Parallel Architectures

Authors

  • J. M. MacLaren
  • J. Mark Bull
Abstract

A serial Fortran 77 micromagnetics code, which simulates the behaviour of thin-film media, was parallelised using both shared memory and message passing paradigms, and run on an SGI Challenge, a Cray T3D and an SGI Origin 2000. We report the observed performance of the code, noting some important effects due to cache behaviour. We also demonstrate how certain commonly-used presentation methods can disguise the true performance profile of a code.
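
The abstract does not reproduce either parallel version of the code, so the sketch below is only an illustration of the two paradigms being compared: the same toy one-dimensional relaxation loop written once with an OpenMP-style shared-memory directive and once as an MPI block decomposition with halo exchange. The kernel, the array size N and the routine names are invented for illustration and are not taken from the authors' micromagnetics code.

```c
/* shared_vs_mpi.c -- illustrative sketch only, not the paper's code.
 * Build e.g. with:  mpicc -fopenmp shared_vs_mpi.c -o shared_vs_mpi
 * Assumes the number of ranks divides N, for brevity. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 1000000

/* Shared-memory paradigm: one directive parallelises the serial loop. */
static void relax_shared(const double *in, double *out, int n)
{
    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++)
        out[i] = 0.5 * (in[i - 1] + in[i + 1]);
}

/* Message-passing paradigm: each rank owns a contiguous block and swaps
 * one halo value with each neighbour before updating its own points. */
static void relax_mpi(double *local, double *out, int nlocal,
                      int rank, int nprocs)
{
    double left = 0.0, right = 0.0;
    MPI_Status st;

    if (rank > 0)
        MPI_Sendrecv(&local[0], 1, MPI_DOUBLE, rank - 1, 0,
                     &left, 1, MPI_DOUBLE, rank - 1, 0,
                     MPI_COMM_WORLD, &st);
    if (rank < nprocs - 1)
        MPI_Sendrecv(&local[nlocal - 1], 1, MPI_DOUBLE, rank + 1, 0,
                     &right, 1, MPI_DOUBLE, rank + 1, 0,
                     MPI_COMM_WORLD, &st);

    for (int i = 0; i < nlocal; i++) {
        double lo = (i == 0) ? left : local[i - 1];
        double hi = (i == nlocal - 1) ? right : local[i + 1];
        out[i] = 0.5 * (lo + hi);
    }
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int nlocal = N / nprocs;
    double *a = malloc(nlocal * sizeof *a);
    double *b = malloc(nlocal * sizeof *b);
    for (int i = 0; i < nlocal; i++)
        a[i] = (double)(rank * nlocal + i);

    relax_mpi(a, b, nlocal, rank, nprocs);   /* message-passing version */
    relax_shared(a, b, nlocal);              /* shared-memory version   */

    if (rank == 0)
        printf("done on %d ranks\n", nprocs);

    free(a);
    free(b);
    MPI_Finalize();
    return 0;
}
```

The contrast is the point of the sketch: the shared-memory variant leaves the data layout untouched and adds one directive, while the message-passing variant must express the data distribution and communication explicitly, which is also where cache and communication effects of the kind the paper discusses tend to surface.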

Related articles

Experiments with Cholesky Factorization on Clusters of SMPs

Cholesky factorization of large dense matrices is an integral part of many applications in science and engineering. In this paper we report on experiments with different parallel versions of Cholesky factorization on modern high-performance computing architectures. For the parallelization of Cholesky factorization we utilized various standard linear algebra software packages and present perform...

Combining Message-passing and Directives in Parallel Applications

Developers of parallel applications can be faced with the problem of combining the two dominant models for parallel processing—distributed-memory and shared-memory parallelism—within one source code. In this article we discuss why it is useful to combine these two programming methodologies, both of which are supported on most high-performance computers, and some of the lessons we learned in wor...

An interactive environment to assist in the parallelisation of Fortran application codes

Porting applications to high-performance parallel computers remains a very expensive effort. The shared-memory and distributed-memory programming models are two of the most popular models used to transform existing serial application codes into a parallel form. Despite the error-prone and costly effort involved in the parallelisation process, the use of message pass...

Compiling MPI for Many-Core Systems

Processors with multiple (or many) cores and shared memory are becoming ubiquitous across the computing spectrum. MPI, the current de facto programming model for scalable parallel applications, enforces copies between source and target processes and thus cannot fully utilize the shared memory and cache architectures of modern machines. To enable MPI-based programs to more fully exploit features of...
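
The copy this snippet refers to can be made concrete with a small sketch; this is not the compilation technique the paper proposes, just an illustration of the difference it targets. A plain MPI_Send/MPI_Recv between two ranks on the same node still moves the payload between separate buffers, whereas an MPI-3 shared-memory window (MPI_Win_allocate_shared) lets one rank read a neighbour's segment in place. The buffer size and values here are invented.

```c
/* copy_vs_shared.c -- illustrative only; run with at least 2 ranks on one node.
 * Build e.g. with:  mpicc copy_vs_shared.c -o copy_vs_shared */
#include <stdio.h>
#include <mpi.h>

#define N 4

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* 1) Two-sided transfer: the payload is copied from the sender's
     *    buffer into the receiver's buffer even when both ranks share
     *    physical memory. */
    double buf[N];
    if (rank == 0) {
        for (int i = 0; i < N; i++) buf[i] = i;
        MPI_Send(buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* 2) MPI-3 shared-memory window: ranks on the same node allocate one
     *    window and can dereference each other's segments directly, with
     *    no intermediate copy. */
    MPI_Comm node;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node);

    double *mine;
    MPI_Win win;
    MPI_Win_allocate_shared(N * sizeof(double), sizeof(double),
                            MPI_INFO_NULL, node, &mine, &win);
    for (int i = 0; i < N; i++)
        mine[i] = rank * 100.0 + i;
    MPI_Win_fence(0, win);    /* make every rank's writes visible */

    int noderank;
    MPI_Comm_rank(node, &noderank);
    if (noderank == 1) {
        MPI_Aint size;
        int disp;
        double *rank0;
        MPI_Win_shared_query(win, 0, &size, &disp, &rank0);
        printf("rank 0's first element, read in place: %.1f\n", rank0[0]);
    }

    MPI_Win_free(&win);
    MPI_Comm_free(&node);
    MPI_Finalize();
    return 0;
}
```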

Parallel Implementation of Computational Fluid Dynamics Codes on Emerging Architectures

We consider two emerging parallel computing platforms and their suitability for large-scale computational mechanics codes. The CRAY Multi-Threaded Architecture (MTA), featuring custom CPUs and an automatic parallelizing compiler suite, is organized around flat, uniform-access shared memory, high-bandwidth connections between CPUs and memory, lightweight synchronization, and context switching between th...

Publication year: 1998